Determination of X-Ray Transient Source Positions 
By Bayesian Analysis of Coded Aperture Data

Abstract



We present a new method of transient point source deconvolution for coded-aperture X-ray detectors.
Our method is based upon the calculation of the likelihood function and its interpretation as a probability 
density for the transient source position by an application of Bayes' Theorem. The method obtains point 
estimates of source positions by finding the maximum of this probability density, and interval estimates of 
prescribed probability by choosing suitable contours of constant probability density. We give the results 
of simulations that we performed to test the method. We also derive approximate analytic expressions for 
the predicted performance of the method. These estimates underline the intuitively plausible properties 
of the method and provide a sound quantitative basis for the design of coded-aperture systems.

1. Introduction

The nature and origin of gamma-ray bursts is a mys- 
tery of twenty-five years' standing. The identification of 
counterparts at other wavelengths is the "Holy Grail" of 
GRB studies, since it is widely believed that this is the 
most likely approach to crack the mystery. The consen- 
sus is that the next step in GRB instrumentation is to 
obtain few-arc-minute or better scale GRB sky locations 
in bulk, possibly in real time. This can be accomplished 
effectively by means of a triggered coded-aperture X-ray 
instrument such as the one planned for HETE (Ricker 
1997). 

Several processing methods for coded-aperture data 
have been proposed, including cross correlation (Feni- 
more 1978), least-squares fitting (Doty 1978), and Max- 
imum Entropy (Sims et al. 1980, Willingdale et al. 
1984). Skinner and Nottingham (1993) have described a 
maximum-likelihood fitting technique. We present here a 
Bayesian scheme for analyzing coded-aperture data from 
such transient events. The method is based on the calcu- 
lation of the joint likelihood function for two stretches of 
data: the stretch covering the transient event itself, and 
a stretch before and/or afterwards, which provides infor- 
mation about the background. We interpret the likeli- 
hood thus obtained as a posterior probability density for 
the transient event location by an application of Bayes' 
Theorem (for a lucid discussion of astrophysical applica- 
tions of Bayesian inference, see Loredo 1992). 

This probability density has several uses. Its maximum
provides a point estimate for the location of the
transient source. We can also use it to obtain a 68% cred- 
ible region for the location of the source. Finally, we can 
use semi-analytical approximations to this probability 
distribution to predict the performance of the method, 
and to state detector design criteria useful for optimizing 
the angular resolution of the instrument. 



2. The Posterior Probability Density 

We begin by exhibiting the posterior probability density 
for the transient position. The discussion in this section 
is rather similar to the discussion in Loredo (1992) of 
Bayesian inference of a Poisson mean, which may usefully 
be read for comparison. 

We assume that a (one- or two-dimensional) coded-aperture
instrument is illuminated by a transient source.
The instrument consists of a position-sensitive detector
with $N_{\rm det}$ position bins, beneath a coded-aperture mask.
We denote the "background" counts observed during a
period $T_{\rm bk}$ prior to the onset of the transient event by
$\mathbf{b}$, where the $i$th component of $\mathbf{b}$ is $b_i$, $i = 1, \ldots, N_{\rm det}$.
We denote by $\mathbf{g}$ the gross counts observed during a time
$T_{\rm burst}$ while the transient event was occurring. The
background $\mathbf{b}$ includes the diffuse X-ray background and the
particle background, as well as any steady point sources
in the field of view. The gross counts $\mathbf{g}$ reflect the shadow
pattern of the coded-aperture mask on the detector as
cast by the illumination of the transient source,
superposed upon the previously measured steady background.

We denote by $\Omega = (\theta, \phi)$ the direction towards the
transient source. We denote the characteristic shadow
pattern of the mask illuminated by a source in the
direction $\Omega$ by $\mathbf{r}(\Omega)$, where we adopt the normalization
$\sum_{i=1}^{N_{\rm det}} r_i = 1$. We further denote the expected number
of total counts due to the transient by $\omega$, so that the
expected number of counts in the $i$th bin due to the
transient source is $\omega r_i$.

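If we assume prior probability densities that are uniform
in the background Poisson rates, in $d\omega$, and in
$d\Omega = d\cos\theta\, d\phi$, and integrate over the unknown
background rates and over $\omega$, we obtain the following
expression for the desired posterior probability density:

$$
P(\Omega \mid \mathbf{g}, \mathbf{b}, I) = \kappa \int_0^\infty d\omega\, e^{-\omega}
\prod_{i=1}^{N_{\rm det}} \left\{ \sum_{s_i=0}^{g_i}
\frac{(\omega r_i)^{s_i}}{s_i!}\,
\frac{(g_i + b_i - s_i)!}{(g_i - s_i)!}\,
\frac{(T_{\rm burst}/T_{\rm bk})^{g_i - s_i}}{(1 + T_{\rm burst}/T_{\rm bk})^{g_i + b_i - s_i + 1}}
\right\}, \qquad (1)
$$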

where $\kappa$ is a normalization constant. If instead of
integrating over the unknown background rates, we assume
that $T_{\rm bk}$ is sufficiently long that the observation of the
$b_i$ determines the rates accurately, we find instead the
simpler approximate formula

$$
P(\Omega \mid \mathbf{g}, \mathbf{b}, I) \approx \kappa \int_0^\infty d\omega\, e^{-\omega}
\prod_{i=1}^{N_{\rm det}} \left[ \omega r_i + (T_{\rm burst}/T_{\rm bk})\, b_i \right]^{g_i}. \qquad (2)
$$

In these equations, the $r_i(\Omega)$ represent our understanding
of the detector response. We may estimate
them by simple ray-tracing of the mask pattern onto the 
detector, or they may be determined by Monte Carlo 
simulations that account for physical effects (Compton 
scattering by various elements of the spacecraft and in- 
strument, finite detector resolution, blurring due to the 
finite thickness of the coded aperture mask, etc.) as well 
as purely geometric effects. 
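
As an illustration of how eq. (2) might be evaluated for a single trial direction, the following is a minimal Python sketch. The function name `log_posterior_eq2`, the choice of $\omega$ grid, and the simple Riemann-sum quadrature are assumptions made for this example and are not taken from the paper; the response vector `r` is assumed to have been computed already for the trial direction and normalized to unit sum. The result is the logarithm of the un-normalized density, so only differences between directions are meaningful.

```python
import numpy as np

def log_posterior_eq2(r, g, b, t_burst, t_bk, n_omega=2000):
    """Log of the un-normalized density of eq. (2) for one trial direction.

    r : response r_i(Omega), normalized so that sum(r) == 1 (assumed given)
    g : gross counts per bin during the transient
    b : background counts per bin during the background stretch
    """
    r = np.asarray(r, dtype=float)
    g = np.asarray(g, dtype=float)
    b = np.asarray(b, dtype=float)
    bkg = (t_burst / t_bk) * b                     # expected background in T_burst
    # Hypothetical integration grid: wide enough to contain the integrand peak.
    omega = np.linspace(0.0, 3.0 * g.sum() + 10.0, n_omega)
    mean = omega[:, None] * r[None, :] + bkg[None, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        # g_i * log(mean_i), with empty bins contributing nothing.
        terms = np.where(g > 0, g * np.log(mean), 0.0)
    log_f = -omega + terms.sum(axis=1)             # log of the integrand
    peak = log_f.max()
    d_omega = omega[1] - omega[0]
    # Riemann sum carried out relative to the peak to avoid overflow.
    return peak + np.log(np.sum(np.exp(log_f - peak)) * d_omega)

# Toy usage: a cyclic 16-element mask pattern shifted by k bins stands in
# for the direction dependence of r(Omega).
pattern = np.array([1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0], dtype=float)
true_shift = 5
g = np.rint(200.0 * np.roll(pattern, true_shift) / pattern.sum() + 2.0)
b = np.full(pattern.size, 20.0)
log_post = [log_posterior_eq2(np.roll(pattern, k) / pattern.sum(), g, b, 1.0, 10.0)
            for k in range(pattern.size)]
best_shift = int(np.argmax(log_post))
```

Working with the logarithm of the integrand keeps the product over bins from overflowing when the counts are large.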

The procedure we have used to arrive at equations
(1) and (2) may be described as a joint fit of the
background and event data, followed by an integration over
the unknown background Poisson means, which are
uninteresting parameters. While this procedure may seem
unfamiliar, it is merely the Poisson version of the
familiar Gaussian technique of background subtraction. In
fact, if the data consisted of Gaussian rather than
Poisson deviates, with the likelihood proportional to
$\exp(-\chi^2/2)$, we would be led directly to the usual formula
for the $\chi^2$ of a background-subtracted signal. To press the
analogy further, the passage from eq. (1) to eq. (2) is
analogous to the neglect in Gaussian theory of the contribution to
the variance due to the uncertainty in the measurement
of the background.

Thus, this method is related to the "$\chi^2$-fitting
approach" to coded-aperture data analysis (Doty 1978) in
the sense that, to within an additive constant, $\chi^2$ is
proportional to the log likelihood in the Gaussian limit of
many counts. In fact, the likelihood method constitutes
a generalization of the $\chi^2$ method which is robust even
in the limit of low signal-to-noise.

We may use eqs. (1) or (2) (depending on the
background rate and on $T_{\rm bk}$) in two ways. We may estimate
the location of the transient on the sky by maximizing
the probability density with respect to $\Omega$, obtaining in
effect the maximum likelihood estimate of the
location. We may also use the probability density to obtain,
say, a 68% probability Bayesian confidence region for the
location of the transient. We choose a grid of points that
is uniform in an equal-area projection of the field of view
near the estimated location of the transient, and
calculate the probability density at each point of the grid. We
choose $\kappa$ so that the sum over the grid is normalized to
1, and find the value of the probability density such that
the sum of the probabilities of the grid points whose density
exceeds this value is 68%. Those grid points make up the region.
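
The grid procedure just described can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `credible_region` is hypothetical, and the input is assumed to be an array of log posterior values already evaluated on an equal-area grid.

```python
import numpy as np

def credible_region(log_post, level=0.68):
    """Return (mask, threshold) for the highest-density region on a grid.

    log_post : array of un-normalized log posterior values, one per grid
               point (any shape); the grid is assumed to be equal-area.
    level    : probability content required of the region.
    """
    log_post = np.asarray(log_post, dtype=float)
    prob = np.exp(log_post - log_post.max())   # subtract the peak for stability
    prob /= prob.sum()                         # plays the role of the constant kappa
    ordered = np.sort(prob.ravel())[::-1]      # densest grid points first
    cumulative = np.cumsum(ordered)
    # First position at which the accumulated probability reaches the level.
    k = min(int(np.searchsorted(cumulative, level)), ordered.size - 1)
    threshold = ordered[k]
    return prob >= threshold, threshold

# Toy usage on a four-point "grid".
mask, threshold = credible_region(np.log([0.4, 0.3, 0.2, 0.1]), level=0.68)
```

Because the grid is discrete, the region returned here is the smallest set of highest-density points whose summed probability is at least the requested level, so its content can slightly exceed 68%.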

3. Simulations 

3.1. Signal-to-Noise Study 

For illustrative and pedagogical purposes, we have simu- 
lated a highly idealized coded-aperture instrument pat- 
terned after the HETE WXM. This detector consists of 
two crossed, one-dimensional position-sensitive propor- 
tional counters (PSPCs), one each in the x and y direc- 
tions. The length of a PSPC bin is 0.1 cm. At a distance 
18.73 cm above each PSPC is a one-dimensional random 
array coded-aperture mask. The length of a mask ele- 
ment is 0.2 cm. Note that the "natural" resolution unit 
for this ideal detector is the ratio of the bin size to the 
height of the mask above the PSPC, i.e. ~ 18'. 
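
The quoted "natural" resolution follows directly from the two dimensions given above. A short check in Python (the variable names are chosen for this example only):

```python
import math

bin_size_cm = 0.1        # length of a PSPC bin
mask_height_cm = 18.73   # height of the mask above the PSPC

# Angle subtended by one detector bin as seen from the mask, in arcminutes.
natural_resolution_arcmin = math.degrees(math.atan(bin_size_cm / mask_height_cm)) * 60.0
print(f"natural resolution: {natural_resolution_arcmin:.1f} arcmin")
```

The result is a little over 18 arcminutes, in agreement with the value stated in the text.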

We model the $r_i(\Omega)$ using only geometric shadowing by
the coded aperture mask and the detector walls, and
projection effects; no physical effects (such as the ones
alluded to in the previous section) are included. The
encoded images obtained during the intervals $T_{\rm burst}$ and
$T_{\rm bk}$ are thus degraded by Poisson noise only.
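
A data set of this kind, degraded by Poisson noise only, could be generated along the following lines. This is a sketch under stated assumptions: the function name `simulate_counts` and its parameters are hypothetical, the background is taken to be uniform across bins, and the response `r` is assumed to come from a separate ray-tracing step that is not shown.

```python
import numpy as np

def simulate_counts(r, omega, bkg_rate, t_burst, t_bk, rng):
    """Draw Poisson-distributed gross and background counts per bin.

    r        : normalized response r_i(Omega) of the true direction
    omega    : expected total transient counts
    bkg_rate : background rate in counts per second per bin (assumed uniform)
    """
    r = np.asarray(r, dtype=float)
    # Background stretch: background only, observed for t_bk seconds.
    b = rng.poisson(bkg_rate * t_bk, size=r.size)
    # Transient stretch: mask shadow plus background, observed for t_burst seconds.
    g = rng.poisson(omega * r + bkg_rate * t_burst)
    return g, b

rng = np.random.default_rng(0)
r_example = np.array([0.25, 0.0, 0.25, 0.0, 0.0, 0.25, 0.25, 0.0])
g_example, b_example = simulate_counts(r_example, omega=50.0, bkg_rate=1.0,
                                       t_burst=1.0, t_bk=100.0, rng=rng)
```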

Figures 1 and 2 show the result of analyzing simulations
at ($\theta$ = 30°, $\phi$ = 45°), for signal-to-noise ratios
(S/N) of 10, 5, 3, and 1. The probability density was
calculated using the posterior density of Section 2. The solid,
dashed, and dotted contours represent 68.3%, 95.5%, and 99.7%
Bayesian confidence regions, respectively.

The figures show that for high S/N, the angular
resolution can be considerably better than the "natural"
resolution, while the instrument's angular resolving power
starts to "melt down" at about S/N=3. Note that the
S/N is for the entire event, not just for the trigger
interval, so that even events discovered near or below a 3-$\sigma$
trigger threshold may be successfully located.

Fig. 1. Analysis of simulated data assuming a transient event from the direction $\theta$ = 30°, $\phi$ = 45°. The plots on the left show the probability
density, while the plots on the right show the 1-, 2-, and 3-$\sigma$ contours for the transient location, as well as the true location (dot) and
the maximum-likelihood point (cross). The X and Y axes are cartesian coordinates in an equal-area projection, shifted so as to place the
true event location at the origin. Top panel: S/N=10. Bottom panel: S/N=5.

3.2. Contour Calibration Study 

By construction, the contours chosen by this procedure 
have the following property: If in simulations we choose 
locations from a distribution of locations that is uniform 
in the FOV, and for each location we simulate data and 
calculate a credible region, in the long run the credible 
region will bracket the "true" location of the simulated 
transient 68% of the time. 

We tested this property by simulating many transient
events. The position of each event was drawn randomly
from a uniform distribution on the sky, limited to an
angle from the detector normal of less than 20°. Each
transient event was superposed on a background of 1
count s$^{-1}$ bin$^{-1}$, and was assigned S/N=3 for the entire
detector. The background stretch lasted 100 s for each
event.
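
Drawing positions uniformly on the sky within a cone about the detector normal amounts to sampling $\cos\theta$ uniformly, consistent with the measure $d\Omega = d\cos\theta\, d\phi$. A minimal sketch, in which the function name `sample_directions` is hypothetical:

```python
import numpy as np

def sample_directions(n, max_angle_deg, rng):
    """Draw n directions uniformly in solid angle within max_angle_deg of the normal."""
    cos_min = np.cos(np.radians(max_angle_deg))
    # Uniform in cos(theta) gives a uniform density per unit solid angle.
    cos_theta = rng.uniform(cos_min, 1.0, size=n)
    theta_deg = np.degrees(np.arccos(cos_theta))
    phi_deg = rng.uniform(0.0, 360.0, size=n)
    return theta_deg, phi_deg

rng = np.random.default_rng(42)
theta_deg, phi_deg = sample_directions(1000, 20.0, rng)
```

Sampling $\theta$ itself uniformly would instead over-weight directions close to the normal.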

Note that from the plots in Figure 2, S/N=3 is about
where the linearity of the model in the position
parameters begins to break down, so that the posterior
density looks significantly non-Gaussian. This is the regime
where we would expect to lose faith in the correctness of
the calibration of contours obtained using the
procedure of Lampton, Margon, and Bowyer (1976).

Fig. 2. Analysis of simulated data assuming a transient event from the direction $\theta$ = 30°, $\phi$ = 45°. The plots on the left show the probability
density, while the plots on the right show the 1-, 2-, and 3-$\sigma$ contours for the transient location, as well as the true location (dot) and
the maximum-likelihood point (cross). The X and Y axes are cartesian coordinates in an equal-area projection, shifted so as to place the
true event location at the origin. Top panel: S/N=3. Bottom panel: S/N=1.

We simulated 1000 events, in each case calculating
the 68.3%, 95.5%, and 99.7% contours and recording
whether the contours included the "true" position. The
number of such "hits" is binomially distributed, so that
if the contours of probability value $p$ bracket the true
position $b$ times in $n$ attempts, we may use as a
statistical measure of the plausibility of this result the
cumulative distribution function

$$
Q(b) = \sum_{j=0}^{b} \frac{n!}{j!\,(n-j)!}\, p^j (1-p)^{n-j}.
$$

$Q(b)$ should be approximately uniformly distributed in the
interval [0, 1], so the result will be plausible unless $Q(b)$
is found to be excessively close to either 0 or 1.

Out of 1000 simulated events, we found that the 68.3%
contour bracketed the true event position 675 times (Q =
0.30), the 95.5% contour did so 956 times (Q = 0.58),
and the 99.7% contour did so 996 times (Q = 0.35).
Thus the contours produced by the method appear to
be well-calibrated (as expected), in the sense that the
long-term frequencies with which the contours bracket the
true positions are indeed consistent with their nominal
probability values.
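
The quoted values of Q can be checked by summing the binomial probabilities directly. The sketch below evaluates each term in log space so that nothing overflows for n = 1000; the function name `binomial_cdf` is chosen for this example.

```python
import math

def binomial_cdf(b, n, p):
    """Q(b): probability of at most b hits in n trials with hit probability p (0 < p < 1)."""
    log_p = math.log(p)
    log_q = math.log1p(-p)
    total = 0.0
    for j in range(b + 1):
        # log of n! / (j! (n-j)!) * p^j * (1-p)^(n-j)
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                    + j * log_p + (n - j) * log_q)
        total += math.exp(log_term)
    return total

# Hit counts reported in the text for the three contour levels.
q_values = {p: binomial_cdf(hits, 1000, p)
            for p, hits in [(0.683, 675), (0.955, 956), (0.997, 996)]}
for p, q in q_values.items():
    print(f"p = {p}: Q = {q:.2f}")
```

None of the three values lies near 0 or 1, which is the sense in which the contours are consistent with their nominal probabilities.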



4. Semi-Analytic Resolution Estimate 

We now introduce some analytical approximations in or- 
der to develop some intuition regarding the method de- 
scribed in the previous section. These approximate es- 
timates are helpful for optimization of detector parame- 
ters, and also allow us to verify that our simulations of 
the method, described in the previous section, behave as 
expected. 

For concreteness, we specialize our analysis to coded- 
aperture systems similar to the HETE WXM. This is 
not a very serious restriction, since other coded-aperture 
detectors (such as the ASM on the Rossi X-Ray Timing 
Explorer) have fairly similar designs. In any event, gen- 
eralization to other types of coded-aperture detectors is 
straightforward.

The HETE WXM consists of two crossed, one- 
dimensional PSPCs, one each in the x and y directions. 
Above each PSPC is a one-dimensional coded-aperture 
random array mask. 

Let the PSPC bin the data in $N_{\rm det}$ position bins of
length $\Delta$, and let $N_{\rm det} \times \Delta = L_{\rm det}$. Also, assume that
the mask consists of $N_{\rm mask}$ adjacent elements of length $l$,
each of which may be open or closed, and define
$N_{\rm mask} \times l = L_{\rm mask}$. For HETE, $L_{\rm mask} > L_{\rm det}$ and $l > \Delta$, and we
assume this is so in the present analysis. Let $h$ be the
height of the mask above the PSPC. We further define
the mask's open fraction $t$, which is such that $t N_{\rm mask}$
is equal to the number of open mask elements. We also
assume that the spatial resolution of the PSPC is
described by a smearing function $g_\sigma(x - y)$, where
$g_\sigma(x - y)\,dx$ is the probability that a photon deposited at a
position $y$ on the PSPC is recorded with a position in the range
$dx$ about $x$. It is assumed to have a width $\sigma$. Finally, we define $\psi$ as
the expected number of transient source counts per bin
assuming a $t = 1$ mask, and $\eta$ as the expected number
of background counts per bin assuming a $t = 1$ mask.

Assuming the direction of the transient is not too far 
from the detector normal, we may evaluate the detector 
resolution by estimating the width of the central peak of 
the probability density, in the form given in Section 2.
The angular variation of the probability density may 
be traced to the variation with the assumed transient 
source position of the shadows of mask element edges on 
the detector. The angular resolution thus increases with 
increasing number of edge shadows on the detector, as 
well as with increasing sharpness of the edge shadows. 
Guided by this observation we have shown that the
detector angular resolution $\delta\theta$ is approximately given by

and where $F(\theta)$ is the fraction of the detector that is not
shadowed by the detector walls for a transient source at
$\theta$. For the 1000 simulations discussed in the previous
section, we find that this expression, translated into a
confidence region, is in good agreement (to within 10%-20%) with
the actual confidence region sizes.

The expression for $\delta\theta$ may be readily interpreted as follows:

As is intuitively reasonable, $\delta\theta \propto \Delta/h$. Thus the
angular resolution is naturally expressed in units of the angle
subtended at the mask by a detector bin. It follows
that it is desirable to design the coded-aperture system
with as small a $\Delta/h$ as possible. Of course, increasing
$h$ reduces the field of view and increases the penalty
imposed by the factor $F(\theta)^{-1/2}$ for off-axis events.

Furthermore, $\delta\theta \propto (L_{\rm det}/l)^{-1/2}$, which accounts for
the "tiling" of the detector by the mask elements; this
factor in effect counts the number of mask element edges
that cast their shadows on the detector. Clearly, it places
a premium on as small an $l$ as possible, that is, $l = \Delta$.
Thus it appears that "oversampling" of the mask by the
detector is not beneficial.

When $\psi \gg \eta$, we have $\delta\theta \propto 1/\psi^{1/2}$, while when
$\psi \ll \eta$, we have $\delta\theta \propto \eta^{1/2}/\psi$. Thus in either case,
$\delta\theta \propto 1/{\rm SNR}$.

The dependence of $\delta\theta$ on $t$ is relatively simple. There
is an "optimal" $t_{\rm opt}$ that minimizes $\delta\theta$, given by

The behavior of $t_{\rm opt}$ as a function of $\psi/\eta$ is shown in
Figure 4. The form of $t_{\rm opt}$ is different from the one found
by previous authors (Fenimore 1978, in't Zand, Heise, &
Jager 1994), although the functional dependence is still
on the signal-to-background. The difference is
ascribable to the difference in the quantities optimized: the
previous work optimized the signal-to-noise in the
resolution element corresponding to the source, whereas in
this work we optimize the angular resolution directly.

Fig. 3. Angular resolution as a function of transmission fraction $t$, for $\psi/\eta$ = 0.1, 1, and 10. The function $f(t)$ is given by $[(1+\psi/(2\eta t))/(1-t)]^{1/2}$.
The quantity $t_{\rm opt}$ is the optimum transmission fraction for the given signal-to-background.

Fig. 4. Optimum transmission fraction as a function of signal-to-background.

Figure 3 shows the $t$ dependence of $\delta\theta$ for
signal-to-background ratios of 0.1, 1, and 10. It is clear from this
figure that precisely optimizing $t$ is not critical. The
$t$-dependence of the resolution is a very forgiving
function, with relatively large differences from $t_{\rm opt}$ costing
relatively little in resolution.
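
The forgiving character of the $t$-dependence can be illustrated numerically. The sketch below assumes that the $t$-dependence of the resolution is carried entirely by the function $f(t) = [(1+\psi/(2\eta t))/(1-t)]^{1/2}$ quoted in the caption of Fig. 3, and locates its minimum on a grid rather than using a closed-form expression. The function names and the grid search are choices made for this example, not the authors' procedure.

```python
import numpy as np

def f_of_t(t, ratio):
    """f(t) as read from the Fig. 3 caption, with ratio = psi / eta."""
    return np.sqrt((1.0 + ratio / (2.0 * t)) / (1.0 - t))

def t_opt_numeric(ratio, n_grid=99999):
    """Grid search for the open fraction that minimizes f(t) on 0 < t < 1."""
    t = np.linspace(0.0, 1.0, n_grid + 2)[1:-1]   # drop the singular endpoints
    return float(t[np.argmin(f_of_t(t, ratio))])

for ratio in (0.1, 1.0, 10.0):
    t_best = t_opt_numeric(ratio)
    penalty = f_of_t(0.5, ratio) / f_of_t(t_best, ratio)
    print(f"psi/eta = {ratio}: t_opt ~ {t_best:.2f}, f(0.5)/f(t_opt) = {penalty:.3f}")
```

Under this reading of $f(t)$, a signal-to-background ratio of 1 gives a minimum near $t \approx 0.37$, and choosing $t = 0.5$ instead worsens $f(t)$ by only about 4%.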

Finally, the factor $S_\sigma$ accounts for the smearing by the
detector of the otherwise perfectly sharp shadows of the
mask element edges. It is the "migration" with assumed
source position of these shadows across the PSPC bins
that gives the probability density its sensitivity to the
transient location, and which makes possible resolutions
$\delta\theta < \Delta/h$. Naturally, the finite linear resolution of the PSPC
degrades the angular resolution of the instrument, by a
factor given by $S_\sigma$. Note that $S_\sigma \to 1$ as $\sigma \to 0$.

5. Conclusions 

The likelihood function approach to coded-aperture data 
analysis is a very powerful method that makes maximal 
use of the information borne by the data. The method 
produces not only point estimates of source position, but 
also credible regions that are well-calibrated in the sense 
that in the long run, a 68% region brackets the true lo- 
cation in 68% of simulations. This remains the case even 
in the low signal-to-noise case, where linear methods lose
their statistical reliability. The method can produce an- 
gular resolution in excess of the "natural" resolution of 
the detector, for high signal-to-noise ratios. 

In addition, a semi-analytic estimate of the angular 
resolution gives a result that is intuitively plausible and 
provides useful guidance for coded-aperture instrument 
design. In particular it provides a novel criterion for 
assessing the impact of the choice of transmission fraction
of the coded-aperture mask, and indicates that it is
not beneficial to design a position-sensitive detector that
"oversamples" the mask pattern.